feat(benches): PR-11 — rmg-benches crate skeleton (Closes #42) #112

flyingrobots · 2025-11-02T09:35:23Z

Adds a minimal benches crate with Criterion and a motion throughput benchmark using public rmg-core APIs.

Why

Establish a place for performance microbenches as part of M1 (Golden Tests) before wiring JSON artifacts and regression gates (next PRs).

What changed

New crate: crates/rmg-benches
- dev-dep: criterion = 0.5 with html_reports
- benches/motion_throughput.rs: builds a demo engine, inserts N motion entities, applies the motion rule, and commits; sanity-decodes the first payload.
Workspace: added rmg-benches to Cargo workspace.
Docs: execution-plan + decision-log entries; rollup regenerated.

How to run

cargo bench -p rmg-benches

Scope

Benches-only; no runtime changes.

Follow-ups

PR-12 (Closes Snapshot hash microbench #43): snapshot hash microbench
PR-13 (Closes Scheduler drain microbench #44): scheduler drain microbench
PR-14 (Closes JSON report + CI upload #45): JSON artifact + upload
PR-15 (Closes Regression thresholds gate #46): regression thresholds gate

…y refresh cadence; regen rollup

… fmt via hook

coderabbitai · 2025-11-02T09:35:31Z

Warning

Rate limit exceeded

@flyingrobots has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 19 minutes and 29 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 4ebdf8d and 73c4d9c.

⛔ Files ignored due to path filters (1)

README.md is excluded by !*.md

📒 Files selected for processing (5)

Cargo.toml (2 hunks)
crates/rmg-benches/benches/motion_throughput.rs (1 hunks)
docs/.vitepress/config.ts (1 hunks)
docs/decision-log.md (2 hunks)
docs/echo-total.md (5 hunks)

Summary by CodeRabbit

Release Notes

Chores
- Added benchmarking infrastructure for performance testing and monitoring of motion throughput operations.
- Updated project roadmap with clarified milestone targets and descriptions.
- Modernized build configuration for license compliance.

Walkthrough

Adds a new benchmarking crate (crates/rmg-benches) to the workspace, introduces a bench profile in Cargo.toml, adds a Criterion motion-throughput benchmark, and updates documentation and cargo-deny configuration to include the new crate and modernization notes. (≤50 words)

Changes

Cohort / File(s)	Summary
Workspace & Profiles `Cargo.toml`	Added `crates/rmg-benches` to `[workspace]` members and introduced a new `[profile.bench]` with `opt-level = 3`, `lto = false`, `codegen-units = 16`, `debug = true`.
Benchmark Crate Manifest `crates/rmg-benches/Cargo.toml`	New crate manifest for `rmg-benches` (v0.1.0, edition 2021, publish = false) with dev-dependency on `criterion` and a `[[bench]]` entry `motion_throughput` (harness = false).
Benchmark Implementation `crates/rmg-benches/benches/motion_throughput.rs`	New Criterion benchmark: `build_engine_with_n_entities(n)` helper and `bench_motion_apply` group measuring motion-apply throughput across entity counts [1,10,100,1000] using batched iterations and transaction apply/commit semantics.
Docs & Logs `docs/ROADMAP.md`, `docs/decision-log.md`, `docs/echo-total.md`, `docs/execution-plan.md`	Added entries documenting PR-11 benches crate, milestone clarifications (M1–M4), CI/rollup notes, and decision-log entries about benches and cargo-deny adjustments.
Policy Config `deny.toml`	Removed deprecated `copyleft`/`unlicensed` keys and added comments about cargo-deny modernization; retained explicit allow list and confidence settings.

Sequence Diagram(s)

sequenceDiagram
    participant Criterion
    participant Bench as motion_throughput
    participant Engine
    participant EngineAPI

    rect rgb(230,245,255)
    note over Criterion,Bench: Per-iteration batching (setup vs measured)
    Criterion->>Bench: iter_batched(setup_fn, bench_fn)
    end

    rect rgb(240,250,240)
    note over Bench,Engine: Setup: construct demo engine & labels
    Bench->>Engine: build_engine_with_n_entities(n)
    Engine-->>Bench: (engine, labels)
    end

    rect rgb(255,245,230)
    note over Bench,EngineAPI: Measured: start txn, apply rule per entity, commit
    Bench->>EngineAPI: start_transaction()
    loop for each entity label
        EngineAPI->>EngineAPI: apply(MOTION_RULE_NAME)
        EngineAPI-->>Bench: Applied | NoMatch
    end
    Bench->>EngineAPI: commit_transaction()
    Bench->>EngineAPI: decode_payload(first_entity)
    end

    Bench-->>Criterion: record throughput

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Files needing extra attention:
- crates/rmg-benches/benches/motion_throughput.rs — verify transaction boundaries, assertion logic (Applied | NoMatch), and that batched measurement isolates setup vs measured code correctly.
- Cargo.toml [profile.bench] — ensure bench profile settings are intentional and compatible with CI/tooling.
- deny.toml — confirm modernization preserves intended license policy semantics.

Possibly related PRs

PR-01: tests(core): golden motion fixtures (JSON) + harness #90 — Adds benchmark code exercising the same motion rule and engine APIs (shares MOTION_RULE_NAME usage); likely overlaps in test design.
Issue Snapshot hash microbench #43 (Snapshot hash microbench) — related intent to add microbenchmarks for compute_state_root; aligns with benchmarking infrastructure expansion.
Issue Scheduler drain microbench #44 (Scheduler drain microbench) — related intent to bench scheduler drain; may reuse the new benches crate layout.

Poem

⚙️ Benchmarks wake, the demo wakes too,

Entities march in measured queue,
Criterion counts each tiny spin,
Cargo profiles tune the bin,
Docs record the run — now let the tests ensue.

Pre-merge checks and finishing touches

❌ Failed checks (3 warnings)

Check name	Status	Explanation	Resolution
Linked Issues Check	⚠️ Warning		Either the linked issues should be removed and replaced with issue #42 (the actual issue this PR closes), or the PR should be extended to implement the specific benchmarks and CI tooling requirements outlined in #43–#46. Clarify whether these future-work issues were incorrectly included or whether this PR is expected to deliver all of this work in a single changeset.
Out of Scope Changes Check	⚠️ Warning	The deny.toml changes represent a cargo-deny configuration modernization (removing deprecated keys, adding explanatory comments for cargo-deny >= 0.14.21) that extends beyond benches-crate scaffolding into workspace-wide build configuration cleanup. While the PR description mentions "CI hotfix: cargo-deny (benches)" suggesting a fix for benches-specific compliance, the actual deny.toml changes are a deliberate policy modernization affecting the entire workspace, not a targeted fix required for the new crate to function. This diverges from the PR's stated benches-only, infrastructure-minimal scope and appears to conflate cleanup work with the primary objective.	Extract the deny.toml modernization into a separate PR scoped to cargo-deny policy updates, or provide explicit justification and evidence that these changes are strictly necessary for the rmg-benches crate to pass CI. If the modernization is required, update the PR description to explicitly call out workspace-wide configuration changes as in-scope and reframe expectations accordingly.
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title Check	✅ Passed	The PR title "feat(benches): PR-11 — rmg-benches crate skeleton (Closes #42)" uses a conventional commit format and directly captures the primary change: establishing a new benches crate skeleton as the foundation for performance benchmarking infrastructure. The title is specific, concise, and accurately reflects the main objective of adding the crates/rmg-benches directory with Criterion-based benchmarking support, avoiding vague phrasing while remaining appropriately scoped to the core contribution.
Description Check	✅ Passed	The PR description is well-structured with clear Why/What/How/Scope sections that meaningfully relate to the changeset. It explains the strategic intent (establishing infrastructure for M1 Golden Tests), itemizes the concrete changes (new crate, workspace updates, documentation entries), provides actionable run instructions, and explicitly clarifies scope limitations by noting runtime-neutral benches-only work. The description contains sufficient detail without being verbose and demonstrates genuine connection to the code changes.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

Cargo.toml (1)
15-19: Tune benchmark profile for speed (not size).

Release profile is size‑optimized; benches should use a speed‑optimized profile to produce stable, representative numbers.

Apply:
 resolver = "2"

 [workspace.dependencies]
 rmg-core = { version = "0.1.0", path = "crates/rmg-core" }

 [profile.release]
 opt-level = "s"
 lto = true
 codegen-units = 1
 strip = true
+
+[profile.bench]
+# Derive from release but optimize for throughput and faster builds
+opt-level = 3
+lto = false
+codegen-units = 16
+debug = true

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between deda508 and 31b2845.

⛔ Files ignored due to path filters (1)

Cargo.lock is excluded by !**/*.lock

📒 Files selected for processing (7)

Cargo.toml (1 hunks)
crates/rmg-benches/Cargo.toml (1 hunks)
crates/rmg-benches/benches/motion_throughput.rs (1 hunks)
docs/ROADMAP.md (1 hunks)
docs/decision-log.md (1 hunks)
docs/echo-total.md (3 hunks)
docs/execution-plan.md (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

crates/rmg-benches/benches/motion_throughput.rs (3)

crates/rmg-core/src/payload.rs (2)

decode_motion_payload (33-44)

encode_motion_payload (17-23)

crates/rmg-core/src/ident.rs (2)

make_node_id (35-40)

make_type_id (27-32)

crates/rmg-core/src/demo/motion.rs (1)

build_motion_demo_engine (110-127)

🪛 GitHub Actions: Docs Rollup Check

docs/echo-total.md

[error] 1-1: docs/echo-total.md is out of date. Run 'make echo-total' (or scripts/gen-echo-total.sh) and commit the result.

🔇 Additional comments (6)

crates/rmg-benches/Cargo.toml (1)
7-13: Criterion features: confirm harness support; consider re-enabling defaults.

With default-features = false, Criterion’s cargo_bench_support may be disabled, which can break criterion_main!/criterion_group! on some toolchains. If you hit build/link errors, drop the override or add the needed feature explicitly.

If compilation fails, apply this change:
-criterion = { version = "0.5", default-features = false, features = ["html_reports"] }
+criterion = { version = "0.5", features = ["html_reports"] }
docs/execution-plan.md (1)

312-315: LGTM — benches entry is clear and scoped.

docs/ROADMAP.md (2)

9-29: LGTM — milestone targets read well and align with perf plan.

34-60: LGTM — issue table refresh note is helpful.

docs/decision-log.md (1)

150-155: LGTM — decision entry documents scope and impact crisply.

docs/echo-total.md (1)

539-544: Manual verification required; script generation failed in sandbox.

The rollup regeneration script (scripts/gen-echo-total.sh) fails in this environment due to bash process substitution limitations (/dev/fd/63 unsupported in sandbox). The output is truncated (~4000 lines removed, including ISSUES_MATRIX.md and ROADMAP.md sections), though those source files exist in the repo.

Cannot definitively verify whether running make echo-total locally will produce correct results. Developer must:

Run make echo-total (or scripts/gen-echo-total.sh) in your local environment

Verify the regenerated rollup includes all expected sections (confirm ISSUES_MATRIX.md and ROADMAP.md are present)

Ensure the diff reflects intentional updates, not truncation

Commit if correct, or investigate script breakage if truncated

crates/rmg-benches/benches/motion_throughput.rs

…stic-rollup-check

…\n- Precompute NodeId in setup; no hashing in measured loop\n- Use debug_assert! in hot path\n- Decode + validate payload and black_box result\n- Set Criterion group sample_size(50) and warm_up_time(3s)\n- Add [profile.bench] tuned for throughput\n\nRefs PR-11

…d detection

coderabbitai

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

docs/echo-total.md (1)
226-234: Regenerate docs/echo-total.md before merge — file is auto-generated and stale per CI.

This file is explicitly marked as auto-generated (This file is generated by scripts/gen-echo-total.sh). Do not hand-edit; run make echo-total, stage the output, and re-push.

Verify locally:
make echo-total
git diff -- docs/echo-total.md
Also applies to: 547-551, 707-712, 737-749

♻️ Duplicate comments (1)

crates/rmg-benches/benches/motion_throughput.rs (1)

1-8: Nice: stabilized bench harness and side-effect retention.

Precomputing NodeId, using debug_assert! in the hot path, and black_box for decoded payloads address prior stability concerns.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 31b2845 and 4ebdf8d.

📒 Files selected for processing (7)

Cargo.toml (2 hunks)
crates/rmg-benches/Cargo.toml (1 hunks)
crates/rmg-benches/benches/motion_throughput.rs (1 hunks)
deny.toml (1 hunks)
docs/decision-log.md (2 hunks)
docs/echo-total.md (5 hunks)
docs/execution-plan.md (2 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

crates/rmg-benches/benches/motion_throughput.rs (3)

crates/rmg-core/src/payload.rs (2)

decode_motion_payload (33-44)

encode_motion_payload (17-23)

crates/rmg-core/src/ident.rs (2)

make_node_id (35-40)

make_type_id (27-32)

crates/rmg-core/src/demo/motion.rs (1)

build_motion_demo_engine (110-127)

🪛 GitHub Actions: Docs Rollup Check

docs/echo-total.md

[error] 1-1: docs/echo-total.md is out of date. Run 'make echo-total' (or scripts/gen-echo-total.sh) and commit the result.

🔇 Additional comments (6)

crates/rmg-benches/Cargo.toml (2)
9-11: Pin or confirm Criterion version wrt cargo-deny wildcards.

With bans.wildcards = "deny", confirm that criterion = "0.5" won’t trip bans. If CI flags it, pin exact patch (e.g., "0.5.x") for reproducible benches.

If bans complain, apply:
-criterion = { version = "0.5", default-features = false, features = ["html_reports"] }
+criterion = { version = "0.5.1", default-features = false, features = ["html_reports"] }
13-15: Good: bench harness wiring.

[[bench]] ... harness = false is correct for Criterion; path+version pin on rmg-core satisfies deny policy.
deny.toml (1)

19-25: Validate cargo-deny version and SPDX identifiers.

Ensure CI uses cargo-deny ≥ 0.14.21 so these deprecations are recognized.

Double-check Unicode-3.0 is a valid SPDX ID; many projects use Unicode-DFS-2016. Replace if deny reports “unknown license id.”

docs/execution-plan.md (1)

36-43: Scope and CI notes read clean.

Benches-only scope and deny modernization are clearly documented; matches workspace changes.

Cargo.toml (1)

7-9: Workspace membership updated correctly.

docs/decision-log.md (1)

150-155: Decision entry matches code changes.

PR-11 benches crate rationale and consequences are crisp and aligned.

Cargo.toml

crates/rmg-benches/benches/motion_throughput.rs

docs/decision-log.md

docs/execution-plan.md

… noise_threshold; update README Benchmarks; regen docs rollup

flyingrobots added 2 commits November 2, 2025 01:23

docs(roadmap): add explicit success criteria per milestone and clarif…

0ee6523

…y refresh cadence; regen rollup

chore(benches): allow missing docs in bench crate and fix unused mut;…

31b2845

… fmt via hook

coderabbitai bot requested changes Nov 2, 2025

View reviewed changes

crates/rmg-benches/benches/motion_throughput.rs Outdated Show resolved Hide resolved

crates/rmg-benches/benches/motion_throughput.rs Show resolved Hide resolved

flyingrobots added 3 commits November 2, 2025 01:57

Merge remote-tracking branch 'origin/main' into echo/hotfix-determini…

c4a69e0

…stic-rollup-check

ci(deny): benches license + no wildcard; docs rollup regen

d69f8ea

ci(deny): modernize deny.toml; docs rollup regen

b942ba8

coderabbitai bot previously approved these changes Nov 2, 2025

View reviewed changes

flyingrobots dismissed coderabbitai[bot]’s stale review via 0852b28 November 2, 2025 11:34

deny(bans): pin rmg-core version in benches dev-deps to avoid wildcar…

4ebdf8d

…d detection

coderabbitai bot requested changes Nov 2, 2025

View reviewed changes

coderabbitai bot previously approved these changes Nov 2, 2025

View reviewed changes

bench: split into build-only and apply+commit; add measurement_time +…

d58e1de

… noise_threshold; update README Benchmarks; regen docs rollup

flyingrobots dismissed coderabbitai[bot]’s stale review via d58e1de November 2, 2025 12:43

docs: add CI note to Decision Log; regen rollup

73c4d9c

flyingrobots merged commit 0430c47 into main Nov 2, 2025
11 of 12 checks passed

flyingrobots deleted the echo/hotfix-deterministic-rollup-check branch November 2, 2025 12:47

This was referenced Nov 2, 2025

feat(benches): PR-12 — snapshot hash microbench (Closes #43) #113

Merged

rmg-core/scheduler: fail-fast drain; u32 radix histogram; keep PendingTx private #121

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(benches): PR-11 — rmg-benches crate skeleton (Closes #42) #112

feat(benches): PR-11 — rmg-benches crate skeleton (Closes #42) #112

Uh oh!

flyingrobots commented Nov 2, 2025

Uh oh!

coderabbitai bot commented Nov 2, 2025 •

edited

Loading

Rate limit exceeded

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

feat(benches): PR-11 — rmg-benches crate skeleton (Closes #42) #112

feat(benches): PR-11 — rmg-benches crate skeleton (Closes #42) #112

Uh oh!

Conversation

flyingrobots commented Nov 2, 2025

Uh oh!

coderabbitai bot commented Nov 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Summary by CodeRabbit

Release Notes

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

coderabbitai bot commented Nov 2, 2025 •

edited

Loading